Skip to content

Expose error model sparsification to python module, update sinter decoders dict#257

Open
noajshu wants to merge 13 commits into
mainfrom
sparsify-errors-python
Open

Expose error model sparsification to python module, update sinter decoders dict#257
noajshu wants to merge 13 commits into
mainfrom
sparsify-errors-python

Conversation

@noajshu

@noajshu noajshu commented May 27, 2026

Copy link
Copy Markdown
Contributor

Exposes the per-shot error model sparsification functionality added in #254 to the Python module and Sinter compatibility layer.

Changes:

  • Added sparsification config fields for Python: sparsify_errors, sparsify_base_degree, sparsify_max_degree, and sparsify_reactivate_limit.
  • Added module-level tesseract.suggest_sparsify_reactivate_limit(num_detectors, sparsify_base_degree).
  • Resolves sparsify_reactivate_limit == -1 only when preparing the decoder.
  • Added validation for sparsification parameters in decoder initialization.
  • Kept CLI sparsification semantics explicit: user-provided --sparsify-reactivate-limit is preserved, while omitted M is stored as -1 until decoder initialization.
  • Changed the C++ CLI default detector ordering to DetIndex when no --det-order-* method is specified. --det-order-bfs remains available for the old BFS behavior.
  • Extended TesseractSinterDecoder to support sparsification config while preserving pickle/serialization compatibility for existing distributed Sinter runs.
  • Registered new sparsified decoders in the Sinter decoders dictionary: tesseract-long-beam-sparsify3, tesseract-long-beam-sparsify2, tesseract-short-beam-sparsify3, and tesseract-short-beam-sparsify2.
  • Added Python and C++ unit tests for sparsify config, validation, heuristic suggestion, Sinter propagation, and end-to-end decoding.

Verification:

  • bazelisk build --jobs=1 --local_resources=cpu=1 src:tesseract
  • bazelisk test --jobs=1 --local_resources=cpu=1 //src:tesseract_tests

noajshu added 2 commits May 26, 2026 21:56
Exposes the per-shot error model sparsification functionality (added in PR 254)
to the Python module and Sinter compatibility layer.

Changes:
- Moved the sparsify reactivate limit (M) heuristic calculation from the CLI
  to `TesseractConfig::get_sparsify_reactivate_limit()` in C++ core.
- Added validation for sparsification parameters in decoder initialization.
- Simplified CLI sparsification parameter resolution.
- Bound sparsification parameters (`sparsify_errors`, `sparsify_base_degree`,
  `sparsify_max_degree`, `sparsify_reactivate_limit`) and the heuristic
  resolution method in `TesseractConfig` Python bindings.
- Extended `TesseractSinterDecoder` to support sparsification config,
  maintaining pickling/serialization compatibility for distributed Sinter runs.
- Registered new sparsified decoders in Sinter decoders dictionary:
  `tesseract-long-beam-sparsify3`, `tesseract-long-beam-sparsify2`,
  `tesseract-short-beam-sparsify3`, and `tesseract-short-beam-sparsify2`.
- Added comprehensive Python unit tests for core sparsify config, validation,
  and Sinter end-to-end decoding.
- Fixed pre-existing CMake build failure under strict compilers by upgrading
  external HiGHS dependency to v1.14.0.

TAG=agy
CONV=365755ec-6af0-4d9c-a7e2-4a363289b763
@noajshu noajshu changed the title feat(python): expose error model sparsification and add sinter decoders Expose error model sparsification to python module, update sinter decoders dict May 27, 2026
@noajshu noajshu marked this pull request as ready for review June 10, 2026 16:38
@noajshu noajshu requested a review from a team as a code owner June 10, 2026 16:38
@noajshu noajshu requested review from LalehB, aria-googler and mhucka and removed request for a team June 10, 2026 16:38

@mhucka mhucka left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not qualified to evaluate the algorithmic or theoretical correctness, but I tried to look at this from the a basic code perspective. I had a couple of minor items, noted in-line.

In addition, one thing that I don't see (and this may just be due to ignorance of Tesseract Decoder) is an update to the .pyi files. This PR does seem to introduce new properties on some objects, which suggests the .pyi files may become out of date. Not 100% sure – just flagging it for your attention.

Comment thread src/py/README.md
Comment thread src/py/tesseract_test.py Outdated
Comment thread src/tesseract.cc Outdated
Comment thread src/tesseract_main.cc

@LalehB LalehB left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Comment thread src/py/README.md
`tesseract-short-beam-sparsify3`, and `tesseract-short-beam-sparsify2`.

As a quick rule of thumb, use the non-sparsified decoders as the safest baseline. Use `sparsify2`
for surface-code-like or mostly graphlike DEMs, and use `sparsify3` for color-code,

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super nit: wdyt about using a name like sparsify-surface-code-like instead of sparcify2? I know it becomes longer which is not the best but something more descriptive might be good.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good idea. Let me think and ask some others about the naming as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants